Dataset statistics
| Number of variables | 20 |
|---|---|
| Number of observations | 327346 |
| Missing cells | 0 |
| Missing cells (%) | 0.0% |
| Duplicate rows | 0 |
| Duplicate rows (%) | 0.0% |
| Total size in memory | 136.0 MiB |
| Average record size in memory | 435.8 B |
Variable types
| NUM | 14 |
|---|---|
| CAT | 6 |
Reproduction
| Analysis started | 2020-04-20 02:20:17.871395 |
|---|---|
| Analysis finished | 2020-04-20 02:51:09.057741 |
| Version | pandas-profiling v2.5.0 |
| Command line | pandas_profiling --config_file config.yaml [YOUR_FILE.csv] |
Warnings
| Warning | Type |
|---|---|
| tailnum has a high cardinality: 4037 distinct values | High cardinality |
| dest has a high cardinality: 104 distinct values | High cardinality |
| time_hour has a high cardinality: 6922 distinct values | High cardinality |
| sched_dep_time is highly correlated with dep_time and 1 other fields | High Correlation |
| dep_time is highly correlated with sched_dep_time and 1 other fields | High Correlation |
| arr_delay is highly correlated with dep_delay | High Correlation |
| dep_delay is highly correlated with arr_delay | High Correlation |
| distance is highly correlated with air_time | High Correlation |
| air_time is highly correlated with distance | High Correlation |
| hour is highly correlated with dep_time and 1 other fields | High Correlation |
| time_hour only contains datetime values, but is categorical. Consider applying pd.to_datetime() | Type |
| dep_delay has 16466 (5.0%) zeros | Zeros |
| arr_delay has 5409 (1.7%) zeros | Zeros |
| minute has 58924 (18.0%) zeros | Zeros |
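The Type warning for time_hour suggests parsing the column as datetimes. The sample strings in this report (for example `20-09-2013 08:00`) appear to be day-first, so an explicit format is safer than automatic inference. The sketch below uses only the standard library to check a candidate format string; the format itself is an assumption inferred from the samples, not something the report states. The same string could then be passed to `pd.to_datetime` through its `format` parameter, for instance `pd.to_datetime(df["time_hour"], format="%d-%m-%Y %H:%M")`, where `df` is a hypothetical DataFrame name.

```python
from datetime import datetime

# Sample strings copied from the time_hour values shown in this report.
samples = ["20-09-2013 08:00", "01-01-2013 05:00", "30-09-2013 23:00"]

# Day-first format inferred from the samples (an assumption about the whole column).
TIME_HOUR_FORMAT = "%d-%m-%Y %H:%M"

# strptime raises ValueError if a string does not match the format,
# which makes a mismatch visible instead of silently swapping day and month.
parsed = [datetime.strptime(s, TIME_HOUR_FORMAT) for s in samples]

for raw, ts in zip(samples, parsed):
    print(raw, "->", ts.isoformat())
```

Ambiguous values such as `01-01-2013` parse the same either way, so it is the unambiguous ones (day greater than 12) that confirm the day-first reading.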
df_index
Real number (ℝ≥0)
| Distinct count | 327346 |
|---|---|
| Unique (%) | 100.0% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 168190.6378 |
|---|---|
| Minimum | 0 |
| Maximum | 336769 |
| Zeros | 1 |
| Zeros (%) | < 0.1% |
| Memory size | 2.5 MiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 16581.25 |
| Q1 | 83007.25 |
| median | 168251.5 |
| Q3 | 252782.75 |
| 95-th percentile | 320245.75 |
| Maximum | 336769 |
| Range | 336769 |
| Interquartile range (IQR) | 169775.5 |
Descriptive statistics
| Standard deviation | 97510.31438 |
|---|---|
| Coefficient of variation (CV) | 0.5797606553 |
| Kurtosis | -1.205125435 |
| Mean | 168190.6378 |
| Median Absolute Deviation (MAD) | 84458.5635 |
| Skewness | 0.002508376679 |
| Sum | 5.505653252e+10 |
| Variance | 9508261411 |
| Value | Count | Frequency (%) | |
| 2047 | 1 | < 0.1% | |
| 166660 | 1 | < 0.1% | |
| 328437 | 1 | < 0.1% | |
| 334582 | 1 | < 0.1% | |
| 332535 | 1 | < 0.1% | |
| 174848 | 1 | < 0.1% | |
| 172801 | 1 | < 0.1% | |
| 178946 | 1 | < 0.1% | |
| 176899 | 1 | < 0.1% | |
| 164613 | 1 | < 0.1% | |
| Other values (327336) | 327336 | > 99.9% |
| Value | Count | Frequency (%) | |
| 0 | 1 | < 0.1% | |
| 1 | 1 | < 0.1% | |
| 2 | 1 | < 0.1% | |
| 3 | 1 | < 0.1% | |
| 4 | 1 | < 0.1% |
| Value | Count | Frequency (%) | |
| 336769 | 1 | < 0.1% | |
| 336768 | 1 | < 0.1% | |
| 336767 | 1 | < 0.1% | |
| 336766 | 1 | < 0.1% | |
| 336765 | 1 | < 0.1% |
year
Categorical
| Distinct count | 1 |
|---|---|
| Unique (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 2.5 MiB |
| Value | Count | Frequency (%) | |
| 2013 | 327346 | 100.0% |
Length
| Max length | 4 |
|---|---|
| Mean length | 4 |
| Min length | 4 |
| Value | Count | Frequency (%) | |
| Decimal_Number | 4 | 100.0% |
| Value | Count | Frequency (%) | |
| Common | 4 | 100.0% |
| Value | Count | Frequency (%) | |
| ASCII | 4 | 100.0% |
month
Real number (ℝ≥0)
| Distinct count | 12 |
|---|---|
| Unique (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 6.564802991 |
|---|---|
| Minimum | 1 |
| Maximum | 12 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Memory size | 2.5 MiB |
Quantile statistics
| Minimum | 1 |
|---|---|
| 5-th percentile | 1 |
| Q1 | 4 |
| median | 7 |
| Q3 | 10 |
| 95-th percentile | 12 |
| Maximum | 12 |
| Range | 11 |
| Interquartile range (IQR) | 6 |
Descriptive statistics
| Standard deviation | 3.413444381 |
|---|---|
| Coefficient of variation (CV) | 0.5199614346 |
| Kurtosis | -1.188328477 |
| Mean | 6.564802991 |
| Median Absolute Deviation (MAD) | 2.95801638 |
| Skewness | -0.02362709289 |
| Sum | 2148962 |
| Variance | 11.65160254 |
| Value | Count | Frequency (%) | |
| 8 | 28756 | 8.8% | |
| 10 | 28618 | 8.7% | |
| 7 | 28293 | 8.6% | |
| 5 | 28128 | 8.6% | |
| 3 | 27902 | 8.5% | |
| 4 | 27564 | 8.4% | |
| 6 | 27075 | 8.3% | |
| 12 | 27020 | 8.3% | |
| 9 | 27010 | 8.3% | |
| 11 | 26971 | 8.2% | |
| Other values (2) | 50009 | 15.3% |
| Value | Count | Frequency (%) | |
| 1 | 26398 | 8.1% | |
| 2 | 23611 | 7.2% | |
| 3 | 27902 | 8.5% | |
| 4 | 27564 | 8.4% | |
| 5 | 28128 | 8.6% |
| Value | Count | Frequency (%) | |
| 12 | 27020 | 8.3% | |
| 11 | 26971 | 8.2% | |
| 10 | 28618 | 8.7% | |
| 9 | 27010 | 8.3% | |
| 8 | 28756 | 8.8% |
day
Real number (ℝ≥0)
| Distinct count | 31 |
|---|---|
| Unique (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 15.74082469 |
|---|---|
| Minimum | 1 |
| Maximum | 31 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Memory size | 2.5 MiB |
Quantile statistics
| Minimum | 1 |
|---|---|
| 5-th percentile | 2 |
| Q1 | 8 |
| median | 16 |
| Q3 | 23 |
| 95-th percentile | 29 |
| Maximum | 31 |
| Range | 30 |
| Interquartile range (IQR) | 15 |
Descriptive statistics
| Standard deviation | 8.777376041 |
|---|---|
| Coefficient of variation (CV) | 0.5576185627 |
| Kurtosis | -1.185600327 |
| Mean | 15.74082469 |
| Median Absolute Deviation (MAD) | 7.58678047 |
| Skewness | -0.001125534436 |
| Sum | 5152696 |
| Variance | 77.04233016 |
| Value | Count | Frequency (%) | |
| 15 | 11150 | 3.4% | |
| 18 | 11131 | 3.4% | |
| 3 | 11070 | 3.4% | |
| 21 | 11017 | 3.4% | |
| 22 | 10985 | 3.4% | |
| 11 | 10983 | 3.4% | |
| 20 | 10974 | 3.4% | |
| 17 | 10961 | 3.3% | |
| 4 | 10949 | 3.3% | |
| 27 | 10845 | 3.3% | |
| Other values (21) | 217281 | 66.4% |
| Value | Count | Frequency (%) | |
| 1 | 10748 | 3.3% | |
| 2 | 10524 | 3.2% | |
| 3 | 11070 | 3.4% | |
| 4 | 10949 | 3.3% | |
| 5 | 10609 | 3.2% |
| Value | Count | Frequency (%) | |
| 31 | 6038 | 1.8% | |
| 30 | 10023 | 3.1% | |
| 29 | 9916 | 3.0% | |
| 28 | 10394 | 3.2% | |
| 27 | 10845 | 3.3% |
dep_time
Real number (ℝ≥0)
| Distinct count | 1317 |
|---|---|
| Unique (%) | 0.4% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 1348.789883 |
|---|---|
| Minimum | 1 |
| Maximum | 2400 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Memory size | 2.5 MiB |
Quantile statistics
| Minimum | 1 |
|---|---|
| 5-th percentile | 624 |
| Q1 | 907 |
| median | 1400 |
| Q3 | 1744 |
| 95-th percentile | 2112 |
| Maximum | 2400 |
| Range | 2399 |
| Interquartile range (IQR) | 837 |
Descriptive statistics
| Standard deviation | 488.3199792 |
|---|---|
| Coefficient of variation (CV) | 0.3620430324 |
| Kurtosis | -1.089029272 |
| Mean | 1348.789883 |
| Median Absolute Deviation (MAD) | 423.7592117 |
| Skewness | -0.02340292549 |
| Sum | 441520973 |
| Variance | 238456.4021 |
| Value | Count | Frequency (%) | |
| 555 | 833 | 0.3% | |
| 556 | 817 | 0.2% | |
| 755 | 816 | 0.2% | |
| 557 | 798 | 0.2% | |
| 655 | 793 | 0.2% | |
| 1455 | 767 | 0.2% | |
| 1454 | 766 | 0.2% | |
| 654 | 745 | 0.2% | |
| 855 | 740 | 0.2% | |
| 756 | 739 | 0.2% | |
| Other values (1307) | 319532 | 97.6% |
| Value | Count | Frequency (%) | |
| 1 | 25 | < 0.1% | |
| 2 | 35 | < 0.1% | |
| 3 | 26 | < 0.1% | |
| 4 | 26 | < 0.1% | |
| 5 | 20 | < 0.1% |
| Value | Count | Frequency (%) | |
| 2400 | 29 | < 0.1% | |
| 2359 | 54 | < 0.1% | |
| 2358 | 76 | < 0.1% | |
| 2357 | 74 | < 0.1% | |
| 2356 | 74 | < 0.1% |
sched_dep_time
Real number (ℝ≥0)
| Distinct count | 1020 |
|---|---|
| Unique (%) | 0.3% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 1340.335098 |
|---|---|
| Minimum | 500 |
| Maximum | 2359 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Memory size | 2.5 MiB |
Quantile statistics
| Minimum | 500 |
|---|---|
| 5-th percentile | 630 |
| Q1 | 905 |
| median | 1355 |
| Q3 | 1729 |
| 95-th percentile | 2050 |
| Maximum | 2359 |
| Range | 1859 |
| Interquartile range (IQR) | 824 |
Descriptive statistics
| Standard deviation | 467.4131564 |
|---|---|
| Coefficient of variation (CV) | 0.3487285807 |
| Kurtosis | -1.198520237 |
| Mean | 1340.335098 |
| Median Absolute Deviation (MAD) | 407.089483 |
| Skewness | 0.006235546313 |
| Sum | 438753333 |
| Variance | 218475.0588 |
| Value | Count | Frequency (%) | |
| 600 | 6836 | 2.1% | |
| 700 | 4822 | 1.5% | |
| 630 | 4690 | 1.4% | |
| 900 | 4666 | 1.4% | |
| 1200 | 4521 | 1.4% | |
| 1700 | 4380 | 1.3% | |
| 1600 | 3971 | 1.2% | |
| 800 | 3862 | 1.2% | |
| 1300 | 3573 | 1.1% | |
| 1900 | 3544 | 1.1% | |
| Other values (1010) | 282481 | 86.3% |
| Value | Count | Frequency (%) | |
| 500 | 340 | 0.1% | |
| 501 | 1 | < 0.1% | |
| 505 | 2 | < 0.1% | |
| 510 | 5 | < 0.1% | |
| 515 | 205 | 0.1% |
| Value | Count | Frequency (%) | |
| 2359 | 810 | 0.2% | |
| 2358 | 44 | < 0.1% | |
| 2355 | 73 | < 0.1% | |
| 2352 | 16 | < 0.1% | |
| 2345 | 1 | < 0.1% |
dep_delay
Real number (ℝ)
| Distinct count | 526 |
|---|---|
| Unique (%) | 0.2% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 12.55515571 |
|---|---|
| Minimum | -43 |
| Maximum | 1301 |
| Zeros | 16466 |
| Zeros (%) | 5.0% |
| Memory size | 2.5 MiB |
Quantile statistics
| Minimum | -43 |
|---|---|
| 5-th percentile | -9 |
| Q1 | -5 |
| median | -2 |
| Q3 | 11 |
| 95-th percentile | 88 |
| Maximum | 1301 |
| Range | 1344 |
| Interquartile range (IQR) | 16 |
Descriptive statistics
| Standard deviation | 40.06568759 |
|---|---|
| Coefficient of variation (CV) | 3.19117409 |
| Kurtosis | 44.35504307 |
| Mean | 12.55515571 |
| Median Absolute Deviation (MAD) | 23.07396344 |
| Skewness | 4.818017945 |
| Sum | 4109880 |
| Variance | 1605.259322 |
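The summary above reports 16466 zeros, which matches the dep_delay warning, and several of its figures can be derived from the others. As a rough cross-check, assuming the usual definitions (mean = sum / observations, range = maximum − minimum, IQR = Q3 − Q1, CV = standard deviation / mean, variance = standard deviation squared), the following sketch recomputes them from the reported values. The variable names are illustrative only.

```python
# Figures copied from the tables above and from the dataset statistics.
n_obs = 327346            # Number of observations
total = 4109880           # Sum
std = 40.06568759         # Standard deviation
minimum, maximum = -43, 1301
q1, q3 = -5, 11
zeros = 16466

mean = total / n_obs

# Each derived value should agree with the corresponding reported row.
derived = {
    "mean": mean,                      # reported: 12.55515571
    "range": maximum - minimum,        # reported: 1344
    "iqr": q3 - q1,                    # reported: 16
    "cv": std / mean,                  # reported: 3.19117409
    "variance": std ** 2,              # reported: 1605.259322
    "zeros_pct": 100 * zeros / n_obs,  # reported: 5.0%
}

for name, value in derived.items():
    print(f"{name}: {value:.6g}")
```

The large CV (above 3) together with a median of -2 and a maximum of 1301 is consistent with the strong right skew reported for this variable.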
| Value | Count | Frequency (%) | |
| -5 | 24765 | 7.6% | |
| -4 | 24557 | 7.5% | |
| -3 | 24158 | 7.4% | |
| -2 | 21463 | 6.6% | |
| -6 | 20649 | 6.3% | |
| -1 | 18761 | 5.7% | |
| -7 | 16714 | 5.1% | |
| 0 | 16466 | 5.0% | |
| -8 | 11770 | 3.6% | |
| 1 | 8026 | 2.5% | |
| Other values (516) | 140017 | 42.8% |
| Value | Count | Frequency (%) | |
| -43 | 1 | < 0.1% | |
| -33 | 1 | < 0.1% | |
| -32 | 1 | < 0.1% | |
| -30 | 1 | < 0.1% | |
| -27 | 1 | < 0.1% |
| Value | Count | Frequency (%) | |
| 1301 | 1 | < 0.1% | |
| 1137 | 1 | < 0.1% | |
| 1126 | 1 | < 0.1% | |
| 1014 | 1 | < 0.1% | |
| 1005 | 1 | < 0.1% |
arr_time
Real number (ℝ≥0)
| Distinct count | 1410 |
|---|---|
| Unique (%) | 0.4% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 1501.908238 |
|---|---|
| Minimum | 1 |
| Maximum | 2400 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Memory size | 2.5 MiB |
Quantile statistics
| Minimum | 1 |
|---|---|
| 5-th percentile | 736 |
| Q1 | 1104 |
| median | 1535 |
| Q3 | 1940 |
| 95-th percentile | 2248 |
| Maximum | 2400 |
| Range | 2399 |
| Interquartile range (IQR) | 836 |
Descriptive statistics
| Standard deviation | 532.8887311 |
|---|---|
| Coefficient of variation (CV) | 0.3548077823 |
| Kurtosis | -0.1946780458 |
| Mean | 1501.908238 |
| Median Absolute Deviation (MAD) | 447.5545315 |
| Skewness | -0.4656925901 |
| Sum | 491643654 |
| Variance | 283970.3997 |
| Value | Count | Frequency (%) | |
| 1008 | 484 | 0.1% | |
| 1013 | 484 | 0.1% | |
| 1015 | 479 | 0.1% | |
| 1012 | 464 | 0.1% | |
| 1005 | 460 | 0.1% | |
| 1016 | 459 | 0.1% | |
| 1006 | 459 | 0.1% | |
| 1011 | 457 | 0.1% | |
| 1007 | 456 | 0.1% | |
| 1040 | 455 | 0.1% | |
| Other values (1400) | 322689 | 98.6% |
| Value | Count | Frequency (%) | |
| 1 | 201 | 0.1% | |
| 2 | 163 | < 0.1% | |
| 3 | 174 | 0.1% | |
| 4 | 172 | 0.1% | |
| 5 | 205 | 0.1% |
| Value | Count | Frequency (%) | |
| 2400 | 150 | < 0.1% | |
| 2359 | 221 | 0.1% | |
| 2358 | 187 | 0.1% | |
| 2357 | 207 | 0.1% | |
| 2356 | 201 | 0.1% |
sched_arr_time
Real number (ℝ≥0)
| Distinct count | 1162 |
|---|---|
| Unique (%) | 0.4% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 1532.788426 |
|---|---|
| Minimum | 1 |
| Maximum | 2359 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Memory size | 2.5 MiB |
Quantile statistics
| Minimum | 1 |
|---|---|
| 5-th percentile | 815 |
| Q1 | 1122 |
| median | 1554 |
| Q3 | 1944 |
| 95-th percentile | 2246 |
| Maximum | 2359 |
| Range | 2358 |
| Interquartile range (IQR) | 822 |
Descriptive statistics
| Standard deviation | 497.9791245 |
|---|---|
| Coefficient of variation (CV) | 0.3248844499 |
| Kurtosis | -0.384548899 |
| Mean | 1532.788426 |
| Median Absolute Deviation (MAD) | 423.8191843 |
| Skewness | -0.3444786467 |
| Sum | 501752160 |
| Variance | 247983.2084 |
| Value | Count | Frequency (%) | |
| 1025 | 1294 | 0.4% | |
| 2015 | 1201 | 0.4% | |
| 1110 | 1191 | 0.4% | |
| 1115 | 1163 | 0.4% | |
| 1235 | 1119 | 0.3% | |
| 2359 | 1091 | 0.3% | |
| 1815 | 1064 | 0.3% | |
| 1015 | 1057 | 0.3% | |
| 1220 | 1056 | 0.3% | |
| 1310 | 1047 | 0.3% | |
| Other values (1152) | 316063 | 96.6% |
| Value | Count | Frequency (%) | |
| 1 | 235 | 0.1% | |
| 2 | 92 | < 0.1% | |
| 3 | 158 | < 0.1% | |
| 4 | 103 | < 0.1% | |
| 5 | 82 | < 0.1% |
| Value | Count | Frequency (%) | |
| 2359 | 1091 | 0.3% | |
| 2358 | 481 | 0.1% | |
| 2357 | 345 | 0.1% | |
| 2356 | 460 | 0.1% | |
| 2355 | 329 | 0.1% |
arr_delay
Real number (ℝ)
| Distinct count | 577 |
|---|---|
| Unique (%) | 0.2% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 6.895376757 |
|---|---|
| Minimum | -86 |
| Maximum | 1272 |
| Zeros | 5409 |
| Zeros (%) | 1.7% |
| Memory size | 2.5 MiB |
Quantile statistics
| Minimum | -86 |
|---|---|
| 5-th percentile | -32 |
| Q1 | -17 |
| median | -5 |
| Q3 | 14 |
| 95-th percentile | 91 |
| Maximum | 1272 |
| Range | 1358 |
| Interquartile range (IQR) | 31 |
Descriptive statistics
| Standard deviation | 44.63329169 |
|---|---|
| Coefficient of variation (CV) | 6.472930089 |
| Kurtosis | 29.233044 |
| Mean | 6.895376757 |
| Median Absolute Deviation (MAD) | 27.76627155 |
| Skewness | 3.71681748 |
| Sum | 2257174 |
| Variance | 1992.130727 |
| Value | Count | Frequency (%) | |
| -13 | 7177 | 2.2% | |
| -10 | 7088 | 2.2% | |
| -12 | 7046 | 2.2% | |
| -14 | 6975 | 2.1% | |
| -11 | 6863 | 2.1% | |
| -9 | 6815 | 2.1% | |
| -15 | 6796 | 2.1% | |
| -7 | 6677 | 2.0% | |
| -17 | 6668 | 2.0% | |
| -8 | 6663 | 2.0% | |
| Other values (567) | 258578 | 79.0% |
| Value | Count | Frequency (%) | |
| -86 | 1 | < 0.1% | |
| -79 | 1 | < 0.1% | |
| -75 | 2 | < 0.1% | |
| -74 | 1 | < 0.1% | |
| -73 | 1 | < 0.1% |
| Value | Count | Frequency (%) | |
| 1272 | 1 | < 0.1% | |
| 1127 | 1 | < 0.1% | |
| 1109 | 1 | < 0.1% | |
| 1007 | 1 | < 0.1% | |
| 989 | 1 | < 0.1% |
carrier
Categorical
| Distinct count | 16 |
|---|---|
| Unique (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 2.5 MiB |
| Value | Count | Frequency (%) | |
| UA | 57782 | 17.7% | |
| B6 | 54049 | 16.5% | |
| EV | 51108 | 15.6% | |
| DL | 47658 | 14.6% | |
| AA | 31947 | 9.8% | |
| MQ | 25037 | 7.6% | |
| US | 19831 | 6.1% | |
| 9E | 17294 | 5.3% | |
| WN | 12044 | 3.7% | |
| VX | 5116 | 1.6% | |
| Other values (6) | 5480 | 1.7% |
Length
| Max length | 2 |
|---|---|
| Mean length | 2 |
| Min length | 2 |
| Value | Count | Frequency (%) | |
| Uppercase_Letter | 17 | 89.5% | |
| Decimal_Number | 2 | 10.5% |
| Value | Count | Frequency (%) | |
| Latin | 17 | 89.5% | |
| Common | 2 | 10.5% |
| Value | Count | Frequency (%) | |
| ASCII | 19 | 100.0% |
flight
Real number (ℝ≥0)
| Distinct count | 3835 |
|---|---|
| Unique (%) | 1.2% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 1943.104501 |
|---|---|
| Minimum | 1 |
| Maximum | 8500 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Memory size | 2.5 MiB |
Quantile statistics
| Minimum | 1 |
|---|---|
| 5-th percentile | 87 |
| Q1 | 544 |
| median | 1467 |
| Q3 | 3412 |
| 95-th percentile | 4689 |
| Maximum | 8500 |
| Range | 8499 |
| Interquartile range (IQR) | 2868 |
Descriptive statistics
| Standard deviation | 1621.523684 |
|---|---|
| Coefficient of variation (CV) | 0.8345015324 |
| Kurtosis | -0.7907590658 |
| Mean | 1943.104501 |
| Median Absolute Deviation (MAD) | 1377.983526 |
| Skewness | 0.6930968219 |
| Sum | 636067486 |
| Variance | 2629339.057 |
| Value | Count | Frequency (%) | |
| 15 | 956 | 0.3% | |
| 27 | 886 | 0.3% | |
| 181 | 875 | 0.3% | |
| 301 | 852 | 0.3% | |
| 161 | 780 | 0.2% | |
| 695 | 756 | 0.2% | |
| 1109 | 709 | 0.2% | |
| 745 | 697 | 0.2% | |
| 1 | 697 | 0.2% | |
| 359 | 694 | 0.2% | |
| Other values (3825) | 319444 | 97.6% |
| Value | Count | Frequency (%) | |
| 1 | 697 | 0.2% | |
| 2 | 51 | < 0.1% | |
| 3 | 628 | 0.2% | |
| 4 | 391 | 0.1% | |
| 5 | 324 | 0.1% |
| Value | Count | Frequency (%) | |
| 8500 | 1 | < 0.1% | |
| 6181 | 80 | < 0.1% | |
| 6180 | 6 | < 0.1% | |
| 6177 | 160 | < 0.1% | |
| 6168 | 2 | < 0.1% |
tailnum
Categorical
| Distinct count | 4037 |
|---|---|
| Unique (%) | 1.2% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 2.5 MiB |
| Value | Count | Frequency (%) | |
| N725MQ | 544 | 0.2% | |
| N722MQ | 485 | 0.1% | |
| N723MQ | 475 | 0.1% | |
| N711MQ | 462 | 0.1% | |
| N713MQ | 449 | 0.1% | |
| N258JB | 420 | 0.1% | |
| N353JB | 403 | 0.1% | |
| N298JB | 402 | 0.1% | |
| N351JB | 391 | 0.1% | |
| N328AA | 389 | 0.1% | |
| Other values (4027) | 322926 | 98.6% |
Length
| Max length | 6 |
|---|---|
| Mean length | 5.995179413 |
| Min length | 5 |
| Value | Count | Frequency (%) | |
| Uppercase_Letter | 24 | 70.6% | |
| Decimal_Number | 10 | 29.4% |
| Value | Count | Frequency (%) | |
| Latin | 24 | 70.6% | |
| Common | 10 | 29.4% |
| Value | Count | Frequency (%) | |
| ASCII | 34 | 100.0% |
origin
Categorical
| Distinct count | 3 |
|---|---|
| Unique (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 2.5 MiB |
| Value | Count | Frequency (%) | |
| EWR | 117127 | 35.8% | |
| JFK | 109079 | 33.3% | |
| LGA | 101140 | 30.9% |
Length
| Max length | 3 |
|---|---|
| Mean length | 3 |
| Min length | 3 |
| Value | Count | Frequency (%) | |
| Uppercase_Letter | 9 | 100.0% |
| Value | Count | Frequency (%) | |
| Latin | 9 | 100.0% |
| Value | Count | Frequency (%) | |
| ASCII | 9 | 100.0% |
dest
Categorical
| Distinct count | 104 |
|---|---|
| Unique (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 2.5 MiB |
| Value | Count | Frequency (%) | |
| ATL | 16837 | 5.1% | |
| ORD | 16566 | 5.1% | |
| LAX | 16026 | 4.9% | |
| BOS | 15022 | 4.6% | |
| MCO | 13967 | 4.3% | |
| CLT | 13674 | 4.2% | |
| SFO | 13173 | 4.0% | |
| FLL | 11897 | 3.6% | |
| MIA | 11593 | 3.5% | |
| DCA | 9111 | 2.8% | |
| Other values (94) | 189480 | 57.9% |
Length
| Max length | 3 |
|---|---|
| Mean length | 3 |
| Min length | 3 |
| Value | Count | Frequency (%) | |
| Uppercase_Letter | 26 | 100.0% |
| Value | Count | Frequency (%) | |
| Latin | 26 | 100.0% |
| Value | Count | Frequency (%) | |
| ASCII | 26 | 100.0% |
air_time
Real number (ℝ≥0)
| Distinct count | 509 |
|---|---|
| Unique (%) | 0.2% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 150.6864602 |
|---|---|
| Minimum | 20 |
| Maximum | 695 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Memory size | 2.5 MiB |
Quantile statistics
| Minimum | 20 |
|---|---|
| 5-th percentile | 40 |
| Q1 | 82 |
| median | 129 |
| Q3 | 192 |
| 95-th percentile | 339 |
| Maximum | 695 |
| Range | 675 |
| Interquartile range (IQR) | 110 |
Descriptive statistics
| Standard deviation | 93.68830466 |
|---|---|
| Coefficient of variation (CV) | 0.6217433506 |
| Kurtosis | 0.8630769908 |
| Mean | 150.6864602 |
| Median Absolute Deviation (MAD) | 72.7175711 |
| Skewness | 1.070705186 |
| Sum | 49326610 |
| Variance | 8777.49843 |
| Value | Count | Frequency (%) | |
| 42 | 2552 | 0.8% | |
| 43 | 2543 | 0.8% | |
| 41 | 2513 | 0.8% | |
| 45 | 2495 | 0.8% | |
| 40 | 2466 | 0.8% | |
| 44 | 2444 | 0.7% | |
| 39 | 2411 | 0.7% | |
| 47 | 2409 | 0.7% | |
| 46 | 2406 | 0.7% | |
| 109 | 2377 | 0.7% | |
| Other values (499) | 302730 | 92.5% |
| Value | Count | Frequency (%) | |
| 20 | 2 | < 0.1% | |
| 21 | 14 | < 0.1% | |
| 22 | 34 | < 0.1% | |
| 23 | 82 | < 0.1% | |
| 24 | 103 | < 0.1% |
| Value | Count | Frequency (%) | |
| 695 | 1 | < 0.1% | |
| 691 | 1 | < 0.1% | |
| 686 | 2 | < 0.1% | |
| 683 | 1 | < 0.1% | |
| 679 | 1 | < 0.1% |
distance
Real number (ℝ≥0)
| Distinct count | 213 |
|---|---|
| Unique (%) | 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 1048.371314 |
|---|---|
| Minimum | 80 |
| Maximum | 4983 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Memory size | 2.5 MiB |
Quantile statistics
| Minimum | 80 |
|---|---|
| 5-th percentile | 199 |
| Q1 | 509 |
| median | 888 |
| Q3 | 1389 |
| 95-th percentile | 2475 |
| Maximum | 4983 |
| Range | 4903 |
| Interquartile range (IQR) | 880 |
Descriptive statistics
| Standard deviation | 735.9085231 |
|---|---|
| Coefficient of variation (CV) | 0.7019540821 |
| Kurtosis | 1.14911845 |
| Mean | 1048.371314 |
| Median Absolute Deviation (MAD) | 568.1345654 |
| Skewness | 1.113392621 |
| Sum | 343180156 |
| Variance | 541561.3544 |
| Value | Count | Frequency (%) | |
| 2475 | 11159 | 3.4% | |
| 762 | 10041 | 3.1% | |
| 733 | 8507 | 2.6% | |
| 2586 | 8109 | 2.5% | |
| 544 | 5961 | 1.8% | |
| 719 | 5828 | 1.8% | |
| 187 | 5773 | 1.8% | |
| 1096 | 5702 | 1.7% | |
| 2454 | 5646 | 1.7% | |
| 944 | 5429 | 1.7% | |
| Other values (203) | 255191 | 78.0% |
| Value | Count | Frequency (%) | |
| 80 | 48 | < 0.1% | |
| 94 | 895 | 0.3% | |
| 96 | 598 | 0.2% | |
| 116 | 412 | 0.1% | |
| 143 | 418 | 0.1% |
| Value | Count | Frequency (%) | |
| 4983 | 342 | 0.1% | |
| 4963 | 359 | 0.1% | |
| 3370 | 8 | < 0.1% | |
| 2586 | 8109 | 2.5% | |
| 2576 | 309 | 0.1% |
hour
Real number (ℝ≥0)
| Distinct count | 19 |
|---|---|
| Unique (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 13.14100982 |
|---|---|
| Minimum | 5 |
| Maximum | 23 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Memory size | 2.5 MiB |
Quantile statistics
| Minimum | 5 |
|---|---|
| 5-th percentile | 6 |
| Q1 | 9 |
| median | 13 |
| Q3 | 17 |
| 95-th percentile | 20 |
| Maximum | 23 |
| Range | 18 |
| Interquartile range (IQR) | 8 |
Descriptive statistics
| Standard deviation | 4.662062914 |
|---|---|
| Coefficient of variation (CV) | 0.354772044 |
| Kurtosis | -1.206908044 |
| Mean | 13.14100982 |
| Median Absolute Deviation (MAD) | 4.053830471 |
| Skewness | 0.01154287952 |
| Sum | 4301657 |
| Variance | 21.73483062 |
| Value | Count | Frequency (%) | |
| 8 | 26734 | 8.2% | |
| 6 | 25447 | 7.8% | |
| 17 | 23667 | 7.2% | |
| 15 | 23082 | 7.1% | |
| 7 | 22475 | 6.9% | |
| 16 | 22045 | 6.7% | |
| 18 | 21072 | 6.4% | |
| 14 | 21022 | 6.4% | |
| 19 | 20507 | 6.3% | |
| 9 | 19931 | 6.1% | |
| Other values (9) | 101364 | 31.0% |
| Value | Count | Frequency (%) | |
| 5 | 1940 | 0.6% | |
| 6 | 25447 | 7.8% | |
| 7 | 22475 | 6.9% | |
| 8 | 26734 | 8.2% | |
| 9 | 19931 | 6.1% |
| Value | Count | Frequency (%) | |
| 23 | 1042 | 0.3% | |
| 22 | 2558 | 0.8% | |
| 21 | 10503 | 3.2% | |
| 20 | 16061 | 4.9% | |
| 19 | 20507 | 6.3% |
minute
Real number (ℝ≥0)
| Distinct count | 60 |
|---|---|
| Unique (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 26.2341162 |
|---|---|
| Minimum | 0 |
| Maximum | 59 |
| Zeros | 58924 |
| Zeros (%) | 18.0% |
| Memory size | 2.5 MiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 0 |
| Q1 | 8 |
| median | 29 |
| Q3 | 44 |
| 95-th percentile | 58 |
| Maximum | 59 |
| Range | 59 |
| Interquartile range (IQR) | 36 |
Descriptive statistics
| Standard deviation | 19.29591774 |
|---|---|
| Coefficient of variation (CV) | 0.7355276464 |
| Kurtosis | -1.234587472 |
| Mean | 26.2341162 |
| Median Absolute Deviation (MAD) | 16.60165774 |
| Skewness | 0.09257147917 |
| Sum | 8587633 |
| Variance | 372.3324414 |
| Value | Count | Frequency (%) | |
| 0 | 58924 | 18.0% | |
| 30 | 33033 | 10.1% | |
| 45 | 19871 | 6.1% | |
| 15 | 18365 | 5.6% | |
| 55 | 18290 | 5.6% | |
| 59 | 15817 | 4.8% | |
| 10 | 14135 | 4.3% | |
| 25 | 14030 | 4.3% | |
| 5 | 13690 | 4.2% | |
| 29 | 13453 | 4.1% | |
| Other values (50) | 107738 | 32.9% |
| Value | Count | Frequency (%) | |
| 0 | 58924 | 18.0% | |
| 1 | 2085 | 0.6% | |
| 2 | 818 | 0.2% | |
| 3 | 1381 | 0.4% | |
| 4 | 1322 | 0.4% |
| Value | Count | Frequency (%) | |
| 59 | 15817 | 4.8% | |
| 58 | 1038 | 0.3% | |
| 57 | 1335 | 0.4% | |
| 56 | 1665 | 0.5% | |
| 55 | 18290 | 5.6% |
time_hour
Categorical
| Distinct count | 6922 |
|---|---|
| Unique (%) | 2.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 2.5 MiB |
| Value | Count | Frequency (%) | |
| 20-09-2013 08:00 | 94 | < 0.1% | |
| 23-09-2013 08:00 | 93 | < 0.1% | |
| 16-09-2013 08:00 | 92 | < 0.1% | |
| 09-09-2013 08:00 | 92 | < 0.1% | |
| 19-09-2013 08:00 | 92 | < 0.1% | |
| 10-09-2013 08:00 | 91 | < 0.1% | |
| 23-10-2013 08:00 | 91 | < 0.1% | |
| 09-10-2013 08:00 | 91 | < 0.1% | |
| 18-09-2013 08:00 | 91 | < 0.1% | |
| 21-10-2013 08:00 | 90 | < 0.1% | |
| Other values (6912) | 326429 | 99.7% |
Length
| Max length | 16 |
|---|---|
| Mean length | 16 |
| Min length | 16 |
| Value | Count | Frequency (%) | |
| Decimal_Number | 10 | 76.9% | |
| Other_Punctuation | 1 | 7.7% | |
| Dash_Punctuation | 1 | 7.7% | |
| Space_Separator | 1 | 7.7% |
| Value | Count | Frequency (%) | |
| Common | 13 | 100.0% |
| Value | Count | Frequency (%) | |
| ASCII | 13 | 100.0% |
Pearson's r
Pearson's correlation coefficient (r) measures the linear correlation between two variables. Its value lies between -1 and +1: -1 indicates total negative linear correlation, 0 indicates no linear correlation, and +1 indicates total positive linear correlation. Furthermore, r is invariant under separate changes in location and scale of the two variables, which implies that, for a linear function, the angle to the x-axis does not affect r. To calculate r for two variables X and Y, divide the covariance of X and Y by the product of their standard deviations.
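The calculation described above can be sketched in plain Python. This is a minimal illustration rather than the routine the report itself runs; the function name `pearson_r` is made up for the example, and the sample values are the distance and air_time entries of the first five rows shown under "First rows".

```python
import math

def pearson_r(xs, ys):
    """Covariance of xs and ys divided by the product of their standard deviations."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # The 1/n factors of covariance and variance cancel, so plain sums suffice.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("r is undefined when one variable is constant")
    return cov / math.sqrt(var_x * var_y)

# distance and air_time from the first five rows of the dataset
distance = [1400, 1416, 1089, 1576, 762]
air_time = [227.0, 227.0, 160.0, 183.0, 116.0]

r = pearson_r(distance, air_time)

# Shifting and (positively) rescaling one variable leaves r unchanged.
r_rescaled = pearson_r([2 * d + 10 for d in distance], air_time)

print(round(r, 4), round(r_rescaled, 4))
```

Five rows are far too few to reproduce the correlation reported for the full dataset; the point is only the mechanics and the invariance under location and scale changes.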
Spearman's ρ
Spearman's rank correlation coefficient (ρ) measures the monotonic correlation between two variables and is therefore better than Pearson's r at capturing nonlinear monotonic relationships. Its value lies between -1 and +1: -1 indicates total negative monotonic correlation, 0 indicates no monotonic correlation, and +1 indicates total positive monotonic correlation. To calculate ρ for two variables X and Y, divide the covariance of the rank variables of X and Y by the product of the standard deviations of those rank variables.
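In other words, ρ is Pearson's r applied to ranks. The sketch below follows that definition, assuming tied values receive the average of the ranks they span (a common convention, not something this report specifies). The helper names are hypothetical and the data is synthetic.

```python
import math

def average_ranks(values):
    """Rank values from 1..n; tied values share the mean of the ranks they span."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # ranks are 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks

def pearson_r(xs, ys):
    """Covariance divided by the product of standard deviations."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

def spearman_rho(xs, ys):
    """Pearson's r computed on the rank variables."""
    return pearson_r(average_ranks(xs), average_ranks(ys))

# Synthetic monotonic but nonlinear relationship: y = x cubed.
x = [1, 2, 3, 4, 5]
y = [v ** 3 for v in x]

print("rho:", spearman_rho(x, y))
print("r:  ", round(pearson_r(x, y), 4))
```

On this example ρ reaches its maximum because the ordering is preserved exactly, while r stays below 1 because the relationship is not linear.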
Kendall's τ
Like Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures the ordinal association between two variables. Its value lies between -1 and +1: -1 indicates total negative correlation, 0 indicates no correlation, and +1 indicates total positive correlation. To calculate τ for two variables X and Y, count the concordant and discordant pairs of observations. τ is then the number of concordant pairs minus the number of discordant pairs, divided by the total number of pairs.
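The pair-counting definition above can be written out directly. This sketch implements exactly that formula (sometimes called tau-a), with tied pairs counted as neither concordant nor discordant; library implementations often apply a tie correction instead, so their results may differ on data with ties. The function name is illustrative, and the sample values are dep_delay and arr_delay from the first five rows.

```python
from itertools import combinations

def kendall_tau(xs, ys):
    """(concordant pairs - discordant pairs) / total number of pairs."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    concordant = discordant = 0
    for (x1, y1), (x2, y2) in combinations(zip(xs, ys), 2):
        product = (x1 - x2) * (y1 - y2)
        if product > 0:
            concordant += 1   # both variables move in the same direction
        elif product < 0:
            discordant += 1   # the variables move in opposite directions
        # product == 0 means a tie in x or y; it counts toward neither
    total_pairs = n * (n - 1) // 2
    return (concordant - discordant) / total_pairs

# dep_delay and arr_delay from the first five rows of the dataset
dep_delay = [2.0, 4.0, 2.0, -1.0, -6.0]
arr_delay = [11.0, 20.0, 33.0, -18.0, -25.0]

print(kendall_tau(dep_delay, arr_delay))
```

The loop visits every pair, so the cost grows quadratically with the number of observations; for the 327346 rows in this dataset that would be tens of billions of pairs, which is why a faster algorithm is normally used in practice.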
First rows
| | df_index | year | month | day | dep_time | sched_dep_time | dep_delay | arr_time | sched_arr_time | arr_delay | carrier | flight | tailnum | origin | dest | air_time | distance | hour | minute | time_hour |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 0 | 2013 | 1 | 1 | 517.0 | 515 | 2.0 | 830.0 | 819 | 11.0 | UA | 1545 | N14228 | EWR | IAH | 227.0 | 1400 | 5 | 15 | 01-01-2013 05:00 |
| 1 | 1 | 2013 | 1 | 1 | 533.0 | 529 | 4.0 | 850.0 | 830 | 20.0 | UA | 1714 | N24211 | LGA | IAH | 227.0 | 1416 | 5 | 29 | 01-01-2013 05:00 |
| 2 | 2 | 2013 | 1 | 1 | 542.0 | 540 | 2.0 | 923.0 | 850 | 33.0 | AA | 1141 | N619AA | JFK | MIA | 160.0 | 1089 | 5 | 40 | 01-01-2013 05:00 |
| 3 | 3 | 2013 | 1 | 1 | 544.0 | 545 | -1.0 | 1004.0 | 1022 | -18.0 | B6 | 725 | N804JB | JFK | BQN | 183.0 | 1576 | 5 | 45 | 01-01-2013 05:00 |
| 4 | 4 | 2013 | 1 | 1 | 554.0 | 600 | -6.0 | 812.0 | 837 | -25.0 | DL | 461 | N668DN | LGA | ATL | 116.0 | 762 | 6 | 0 | 01-01-2013 06:00 |
| 5 | 5 | 2013 | 1 | 1 | 554.0 | 558 | -4.0 | 740.0 | 728 | 12.0 | UA | 1696 | N39463 | EWR | ORD | 150.0 | 719 | 5 | 58 | 01-01-2013 05:00 |
| 6 | 6 | 2013 | 1 | 1 | 555.0 | 600 | -5.0 | 913.0 | 854 | 19.0 | B6 | 507 | N516JB | EWR | FLL | 158.0 | 1065 | 6 | 0 | 01-01-2013 06:00 |
| 7 | 7 | 2013 | 1 | 1 | 557.0 | 600 | -3.0 | 709.0 | 723 | -14.0 | EV | 5708 | N829AS | LGA | IAD | 53.0 | 229 | 6 | 0 | 01-01-2013 06:00 |
| 8 | 8 | 2013 | 1 | 1 | 557.0 | 600 | -3.0 | 838.0 | 846 | -8.0 | B6 | 79 | N593JB | JFK | MCO | 140.0 | 944 | 6 | 0 | 01-01-2013 06:00 |
| 9 | 9 | 2013 | 1 | 1 | 558.0 | 600 | -2.0 | 753.0 | 745 | 8.0 | AA | 301 | N3ALAA | LGA | ORD | 138.0 | 733 | 6 | 0 | 01-01-2013 06:00 |
Last rows
| | df_index | year | month | day | dep_time | sched_dep_time | dep_delay | arr_time | sched_arr_time | arr_delay | carrier | flight | tailnum | origin | dest | air_time | distance | hour | minute | time_hour |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 327336 | 336760 | 2013 | 9 | 30 | 2211.0 | 2059 | 72.0 | 2339.0 | 2242 | 57.0 | EV | 4672 | N12145 | EWR | STL | 120.0 | 872 | 20 | 59 | 30-09-2013 20:00 |
| 327337 | 336761 | 2013 | 9 | 30 | 2231.0 | 2245 | -14.0 | 2335.0 | 2356 | -21.0 | B6 | 108 | N193JB | JFK | PWM | 48.0 | 273 | 22 | 45 | 30-09-2013 22:00 |
| 327338 | 336762 | 2013 | 9 | 30 | 2233.0 | 2113 | 80.0 | 112.0 | 30 | 42.0 | UA | 471 | N578UA | EWR | SFO | 318.0 | 2565 | 21 | 13 | 30-09-2013 21:00 |
| 327339 | 336763 | 2013 | 9 | 30 | 2235.0 | 2001 | 154.0 | 59.0 | 2249 | 130.0 | B6 | 1083 | N804JB | JFK | MCO | 123.0 | 944 | 20 | 1 | 30-09-2013 20:00 |
| 327340 | 336764 | 2013 | 9 | 30 | 2237.0 | 2245 | -8.0 | 2345.0 | 2353 | -8.0 | B6 | 234 | N318JB | JFK | BTV | 43.0 | 266 | 22 | 45 | 30-09-2013 22:00 |
| 327341 | 336765 | 2013 | 9 | 30 | 2240.0 | 2245 | -5.0 | 2334.0 | 2351 | -17.0 | B6 | 1816 | N354JB | JFK | SYR | 41.0 | 209 | 22 | 45 | 30-09-2013 22:00 |
| 327342 | 336766 | 2013 | 9 | 30 | 2240.0 | 2250 | -10.0 | 2347.0 | 7 | -20.0 | B6 | 2002 | N281JB | JFK | BUF | 52.0 | 301 | 22 | 50 | 30-09-2013 22:00 |
| 327343 | 336767 | 2013 | 9 | 30 | 2241.0 | 2246 | -5.0 | 2345.0 | 1 | -16.0 | B6 | 486 | N346JB | JFK | ROC | 47.0 | 264 | 22 | 46 | 30-09-2013 22:00 |
| 327344 | 336768 | 2013 | 9 | 30 | 2307.0 | 2255 | 12.0 | 2359.0 | 2358 | 1.0 | B6 | 718 | N565JB | JFK | BOS | 33.0 | 187 | 22 | 55 | 30-09-2013 22:00 |
| 327345 | 336769 | 2013 | 9 | 30 | 2349.0 | 2359 | -10.0 | 325.0 | 350 | -25.0 | B6 | 745 | N516JB | JFK | PSE | 196.0 | 1617 | 23 | 59 | 30-09-2013 23:00 |